112 research outputs found

    A Nonlinear Mixture Autoregressive Model For Speaker Verification

    In this work, we apply a nonlinear mixture autoregressive (MixAR) model to supplant the Gaussian mixture model (GMM) for speaker verification. MixAR is a statistical model that is a probabilistically weighted combination of components, each of which is an autoregressive filter in addition to a mean. The probabilistic mixing and the data-dependent weights are responsible for the nonlinear nature of the model. Our experiments with synthetic as well as real speech data from standard speech corpora show that the MixAR model outperforms the GMM, especially under unseen noisy conditions. Moreover, MixAR did not require delta features and used 2.5x fewer parameters to achieve performance comparable to or better than that of a GMM using static as well as delta features. Also, MixAR suffered less from overfitting than GMM when training data was sparse. However, MixAR performance deteriorated more quickly than that of GMM when the evaluation data duration was reduced. This could impose a minimum on the amount of evaluation data required when using the MixAR model for speaker verification.
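To make the model description above concrete, here is a minimal sketch of how the likelihood of a one-dimensional sequence might be evaluated under a mixture autoregressive model. The abstract does not give the exact parameterization, so the softmax gating on past samples, the function name `mixar_log_likelihood`, and all parameter names are assumptions made for illustration only.

```python
import numpy as np

def mixar_log_likelihood(x, gate_w, gate_b, ar, mu, sigma):
    """Log-likelihood of a 1-D sequence under a toy mixture autoregressive model.

    x:      (T,) observed sequence
    gate_w: (M, p) gating weights applied to the p previous samples (assumed form)
    gate_b: (M,) gating biases (assumed form)
    ar:     (M, p) autoregressive coefficients, one row per component
    mu:     (M,) component means
    sigma:  (M,) component standard deviations
    """
    x = np.asarray(x, dtype=float)
    p = ar.shape[1]
    total = 0.0
    for t in range(p, len(x)):
        past = x[t - p:t][::-1]              # x[t-1], x[t-2], ..., x[t-p]
        # Data-dependent mixing weights (softmax over components).
        logits = gate_w @ past + gate_b
        logits = logits - logits.max()       # numerical stability
        w = np.exp(logits) / np.exp(logits).sum()
        # Each component predicts x[t] with its own AR filter plus a mean.
        pred = mu + ar @ past
        dens = np.exp(-0.5 * ((x[t] - pred) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        total += np.log(np.dot(w, dens))
    return total
```

With a single component and all autoregressive coefficients set to zero, the sketch reduces to an ordinary Gaussian log-likelihood over the samples after the first `p`, which is a quick sanity check on the formulation.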

    Lexical Speaker Error Correction: Leveraging Language Models for Speaker Diarization Error Correction

    Speaker diarization (SD) is typically used with an automatic speech recognition (ASR) system to ascribe speaker labels to recognized words. The conventional approach reconciles outputs from independently optimized ASR and SD systems, where the SD system typically uses only acoustic information to identify the speakers in the audio stream. This approach can lead to speaker errors, especially around speaker turns and regions of speaker overlap. In this paper, we propose a novel second-pass speaker error correction system using lexical information, leveraging the power of modern language models (LMs). Our experiments across multiple telephony datasets show that our approach is both effective and robust. Training and tuning only on the Fisher dataset, this error correction approach leads to relative word-level diarization error rate (WDER) reductions of 15-30% on three telephony datasets: RT03-CTS, Callhome American English and held-out portions of Fisher. Comment: Accepted at INTERSPEECH 202
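The abstract reports relative WDER reductions without defining the metric. The sketch below assumes that WDER is the fraction of already-aligned words whose hypothesized speaker label differs from the reference label, and that speaker labels have already been mapped between reference and hypothesis. The function names and the example numbers are hypothetical.

```python
def wder(ref_speakers, hyp_speakers):
    """Fraction of aligned words carrying the wrong speaker label (assumed definition)."""
    if len(ref_speakers) != len(hyp_speakers):
        raise ValueError("reference and hypothesis must be aligned word for word")
    if not ref_speakers:
        return 0.0
    wrong = sum(r != h for r, h in zip(ref_speakers, hyp_speakers))
    return wrong / len(ref_speakers)

def relative_reduction(baseline, corrected):
    """Relative reduction of an error rate, e.g. 0.10 -> 0.08 is a 20% reduction."""
    return (baseline - corrected) / baseline

# Hypothetical example: one of four words is attributed to the wrong speaker.
first_pass = wder(["A", "A", "B", "B"], ["A", "B", "B", "B"])
```

A second-pass correction that lowers a WDER of 0.10 to 0.08 would count as a 20% relative reduction under this definition, which falls inside the 15-30% range quoted above.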

    Device Directedness with Contextual Cues for Spoken Dialog Systems

    In this work, we define barge-in verification as a supervised learning task where audio-only information is used to classify user spoken dialogue into true and false barge-ins. Following the success of pre-trained models, we use low-level speech representations from a self-supervised representation learning model for our downstream classification task. Further, we propose a novel technique to infuse lexical information directly into speech representations to improve the domain-specific language information implicitly learned during pre-training. Experiments conducted on spoken dialog data show that our proposed model, trained to validate barge-in entirely from speech representations, is 38% faster (relative) and achieves a 4.5% relative F1 score improvement over a baseline LSTM model that uses both audio and Automatic Speech Recognition (ASR) 1-best hypotheses. On top of this, our best proposed model with lexically infused representations along with contextual features provides a further relative improvement of 5.7% in the F1 score but is only 22% faster than the baseline.

    Peek into the Future Camera-based Occupant Sensing in Configurable Cabins for Autonomous Vehicles

    The development of fully autonomous vehicles (AVs) can potentially eliminate drivers and introduce unprecedented seating designs. However, highly flexible seat configurations may lead to unconventional occupant poses and actions. Understanding occupant behaviors and prioritizing safety features have become prominent topics at the AV research frontier. Visual sensors offer cost-efficiency and high-fidelity imaging and are increasingly applied for in-car sensing. Occlusion is a major concern for this type of system in crowded car cabins. What a visual-sensing framework should look like to support 2-D and 3-D human pose tracking across highly configurable seats is important but largely unknown. As one of the first studies to touch on this topic, we peek into the future camera-based sensing framework via a simulation experiment. Using constructed representative car cabins, seat layouts, and occupant sizes, camera coverage from different angles and positions is simulated and calculated. The comprehensive coverage data are synthesized through an optimization process to determine the camera layout and overall occupant coverage. The results show the number and arrangement of cameras needed to fully or partially cover all the occupants with changeable configurations of up to six seats. Comment: Conference: 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) Link: https://ieeexplore.ieee.org/document/956442
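The abstract does not say which optimization process is used to choose the camera layout. As an illustration only, the sketch below swaps in a simple greedy maximum-coverage heuristic: given, for each candidate camera position, the set of occupant targets it can see, it repeatedly picks the camera that adds the most uncovered targets. The function name, camera names, and coverage sets are all hypothetical.

```python
def greedy_camera_layout(coverage, max_cameras):
    """Pick up to max_cameras candidates by greedy maximum coverage.

    coverage: dict mapping a candidate camera name to the set of targets
              (for example seat/configuration pairs) that it can see.
    Returns (chosen camera names in pick order, set of covered targets).
    """
    chosen = []
    covered = set()
    remaining = dict(coverage)
    while remaining and len(chosen) < max_cameras:
        # Candidate that adds the most targets not yet covered.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # no remaining camera improves coverage
        chosen.append(best)
        covered |= gain
    return chosen, covered

# Hypothetical coverage data for three candidate positions and six seats.
candidates = {
    "roof_front": {1, 2, 3},
    "roof_rear": {3, 4},
    "pillar": {4, 5, 6},
}
layout, seen = greedy_camera_layout(candidates, max_cameras=2)
coverage_fraction = len(seen) / 6
```

A greedy heuristic is not guaranteed to find the best layout, but it shows how per-camera coverage data can be combined into a layout and an overall coverage figure.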

    Pests and predators of oysters

    In all aquaculture practices, the detrimental effects of cohabiting organisms arise through predation, competition, disease or parasitism. Hanson (1974) stated that limited predation can serve to weed out some diseased members of a crop and also help in controlling epizootic infections. But large-scale mortalities result in economic loss by reducing the tended stock. Control of predation also adds to the production cost (Mackenzie, 1970a). While evolving culture methods for fish or shellfish, identifying and properly using methods to prevent and control the numerous predators of cultivable organisms is absolutely essential to maximise production.

    Group III PLA2 from the scorpion, Mesobuthus tamulus: cloning and recombinant expression in E. coli

    Phospholipases A2 (PLA2) are enzymes that specifically hydrolyze the sn-2 fatty acid acyl bond of phospholipids, producing a free fatty acid and a lyso-phospholipid. We report the cloning and expression of a secretory phospholipase A2 (sPLA2) from Mesobuthus tamulus, the Indian red scorpion. The nucleotide sequence codes for a 167 residue enzyme. The open reading frame codes for a 31 amino acid signal peptide followed by a mature portion of the protein. The primary structure shows the calcium binding motif, catalytic residues, 8 highly conserved cysteines and a C-terminal extension, which classify it as a group III PLA2. The entire transcript was expressed in Escherichia coli and was purified by metal affinity chromatography under denaturing conditions. The protein was refolded to its active form by serial dilutions in the refolding buffer. Hemolytic assays indicate that the protein adopts a functional conformation. Functional requisites such as an optimum pH of 8 and calcium dependency are shown. This report provides a simple but robust methodology for recombinant expression of toxic proteins.

    Identification of 4 New Loci Associated With Primary Hyperparathyroidism (PHPT) and a Polygenic Risk Score for PHPT

    CONTEXT: A hypothesis-free genetic association analysis has not been reported for patients with primary hyperparathyroidism (PHPT). OBJECTIVE: We aimed to investigate genetic associations with PHPT using both genome-wide association study (GWAS) and candidate gene approaches. METHODS: A cross-sectional study was conducted among patients of European White ethnicity recruited in Tayside (Scotland, UK). Electronic medical records were used to identify PHPT cases and controls, and linked to genetic biobank data. Genetic associations were assessed using logistic regression models and reported as odds ratios (ORs). The combined effect of the genotypes was examined by genetic risk score (GRS) analysis. RESULTS: We identified 15 622 individuals for the GWAS that yielded 34 top single-nucleotide variations (formerly single-nucleotide polymorphisms), and LPAR3-rs147672681 reached genome-wide statistical significance (P = 1.2e-08). Using a more restricted PHPT definition, 8722 individuals with data on the GWAS-identified loci were found. Age- and sex-adjusted ORs for the effect alleles of SOX9-rs11656269, SLITRK5-rs185436526, and BCDIN3D-AS1-rs2045094 showed statistically significant increased risks (P < 1.5e-03). GRS analysis of 5482 individuals showed an OR of 2.51 (P = 1.6e-04), 3.78 (P = 4.0e-08), and 7.71 (P = 5.3e-17) for the second, third, and fourth quartiles, respectively, compared to the first, and there was a statistically significant linear trend across quartiles (P < 1.0e-04). Results were similar when stratifying by sex. CONCLUSION: Using genetic loci discovered in a GWAS of PHPT carried out in a Scottish population, this study suggests new evidence for the involvement of genetic variants at SOX9, SLITRK5, LPAR3, and BCDIN3D-AS1. It also suggests that male and female carriers of greater numbers of PHPT-risk alleles both have a statistically significant increased risk of PHPT.
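The quartile comparison above can be illustrated with a small sketch. It assumes an unweighted score that simply counts risk alleles per individual and computes a crude, unadjusted odds ratio from a 2x2 table of counts; the study itself reports ORs from logistic regression models, so this is a simplification. All function names and counts below are hypothetical and are not data from the study.

```python
from statistics import quantiles

def allele_count_score(genotype):
    """Unweighted risk score: total risk alleles (0, 1 or 2 per locus)."""
    return sum(genotype)

def quartile_of(score, all_scores):
    """Quartile (1-4) of a score relative to the cohort's score distribution."""
    q1, q2, q3 = quantiles(all_scores, n=4)
    if score <= q1:
        return 1
    if score <= q2:
        return 2
    if score <= q3:
        return 3
    return 4

def odds_ratio(cases_q, controls_q, cases_ref, controls_ref):
    """Crude odds ratio of a quartile versus the reference (first) quartile."""
    return (cases_q * controls_ref) / (controls_q * cases_ref)

# Hypothetical cohort scores and hypothetical case/control counts.
cohort_scores = [1, 2, 3, 4, 5, 6, 7, 8]
top_quartile = quartile_of(allele_count_score([2, 2, 2, 2]), cohort_scores)
crude_or = odds_ratio(cases_q=30, controls_q=70, cases_ref=10, controls_ref=90)
```

An odds ratio above 1 for the higher quartiles, increasing from the second to the fourth, is the pattern the abstract describes as a linear trend across quartiles.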